47 research outputs found

    Trends in the Supreme Court: Mr. Jefferson’s Crumbling Wall-A Comment on Lynch V. Donnelly

    This paper presents a novel learning algorithm that finds the linear combination of one set of multi-dimensional variates that is the best predictor, and at the same time finds the linear combination of another set which is the most predictable. This relation is known as the canonical correlation and has the property of being invariant with respect to affine transformations of the two sets of variates. The algorithm successively finds all the canonical correlations, beginning with the largest one. It is shown that canonical correlations can be used in computer vision to find feature detectors by giving examples of the desired features. When used at the pixel level, the method finds quadrature filters; when used at a higher level, it finds combinations of filter outputs that are less sensitive to noise than vector averaging.
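
    The algorithm described above is iterative; as a point of reference, the same quantities can be computed in closed form for a batch of data. The sketch below does this with NumPy; the function names and the small regularization term are illustrative and not taken from the paper.

```python
import numpy as np

def _inv_sqrt(C):
    # Inverse matrix square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def canonical_correlations(X, Y, reg=1e-8):
    """Illustrative batch CCA: X is (n, dx), Y is (n, dy). Returns the
    canonical correlations (largest first) and projection directions."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whiten both sets, then take the SVD of the cross-covariance.
    Kx, Ky = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    # Singular values are the canonical correlations; they are invariant
    # under affine transformations of X and Y.
    return s, Kx @ U, Ky @ Vt.T
```

    The singular values come out largest first, matching the order in which the paper's algorithm extracts the correlations one at a time.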

    Graph-based Neural Weather Prediction for Limited Area Modeling

    The rise of accurate machine learning methods for weather forecasting is creating radical new possibilities for modeling the atmosphere. In a time of climate change, having access to high-resolution forecasts from models like these is also becoming increasingly vital. While most existing Neural Weather Prediction (NeurWP) methods focus on global forecasting, an important question is how these techniques can be applied to limited area modeling. In this work we adapt the graph-based NeurWP approach to the limited area setting and propose a multi-scale hierarchical model extension. Our approach is validated by experiments with a local model for the Nordic region. Comment: 38 pages, 27 figures. Accepted to the Tackling Climate Change with Machine Learning workshop at NeurIPS 2023. Code available at: https://github.com/joeloskarsson/neural-la
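
    For readers unfamiliar with graph-based NeurWP, the core computational unit is message passing over a mesh graph covering the forecast area. The following is a minimal sketch of one such step; the shapes, the tanh nonlinearity, and the function name are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def mesh_message_passing(node_feats, edge_index, edge_feats, W_msg, W_upd):
    """Illustrative message-passing step on a (limited-area) mesh graph.
    node_feats: (n_nodes, d), edge_index: (n_edges, 2) as [sender, receiver],
    edge_feats: (n_edges, e). W_msg: (2*d + e, d), W_upd: (2*d, d)."""
    senders, receivers = edge_index[:, 0], edge_index[:, 1]
    # Each edge computes a message from sender, receiver and edge features.
    msg_in = np.concatenate(
        [node_feats[senders], node_feats[receivers], edge_feats], axis=1)
    messages = np.tanh(msg_in @ W_msg)
    # Aggregate incoming messages at each receiver node.
    agg = np.zeros_like(node_feats)
    np.add.at(agg, receivers, messages)
    # Residual node update from the node's own state and the aggregated messages.
    return node_feats + np.tanh(np.concatenate([node_feats, agg], axis=1) @ W_upd)
```

    In a multi-scale hierarchical model, steps of this kind are repeated on meshes of different resolution, with additional edges connecting the levels.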

    Reinforcement Learning and Distributed Local Model Synthesis

    Reinforcement learning is a general and powerful way to formulate complex learning problems and acquire good system behaviour. The goal of a reinforcement learning system is to maximize a long-term sum of instantaneous rewards provided by a teacher. In its extreme form, reinforcement learning only requires that the teacher can provide a measure of success. This formulation does not require a training set with correct responses, and allows the system to become better than its teacher. In reinforcement learning much of the burden is moved from the teacher to the training algorithm. The exact and general algorithms that exist for these problems are based on dynamic programming (DP) and have a computational complexity that grows exponentially with the dimensionality of the state space. These algorithms can only be applied to real-world problems if an efficient encoding of the state space can be found. To cope with these problems, heuristic algorithms and function approximation need to be incorporated. In this thesis it is argued that local models have the potential to help solve problems in high-dimensional spaces, whereas global models do not. This is motivated by the bias-variance dilemma, which is resolved with the assumption that the system is constrained to live on a low-dimensional manifold in the space of inputs and outputs. This observation leads to the introduction of bias in terms of continuity and locality. A linear approximation of the system dynamics and a quadratic function describing the long-term reward are suggested to constitute a suitable local model. For problems involving one such model, i.e. linear quadratic regulation problems, novel convergence proofs for heuristic DP algorithms are presented. This is one of the few available convergence proofs for reinforcement learning in continuous state spaces. Reinforcement learning is closely related to optimal control, where local models are commonly used. Relations to existing methods are investigated, e.g. adaptive control, gain scheduling, fuzzy control, and jump linear systems. Ideas from these areas are combined in a synergistic way to produce a new algorithm for heuristic dynamic programming where function parameters and locality, expressed as model applicability, are learned on-line. Both top-down and bottom-up versions are presented. The emerging local models and their applicability need to be memorized by the learning system. The binary tree is put forward as a suitable data structure for on-line storage and retrieval of these functions.
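
    As an illustration of what a single local model of the suggested form looks like, the sketch below pairs linear dynamics with a quadratic long-term cost and iterates a heuristic-DP (value-iteration) backup for the linear quadratic regulation case. It uses the standard cost-minimization convention and is only a sketch, not the thesis's algorithm.

```python
import numpy as np

def lqr_value_iteration(A, B, Q, R, iters=500):
    """Illustrative heuristic-DP iteration for one local model: linear
    dynamics x' = A x + B u and quadratic stage cost x'Qx + u'Ru.
    Returns the quadratic value-function matrix P and the gain K (u = -K x)."""
    P = np.zeros_like(Q)
    for _ in range(iters):
        # Greedy policy with respect to the current quadratic value function.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Back up the value function under that policy (a Riccati step).
        P = Q + A.T @ P @ (A - B @ K)
    return P, K
```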

    Behavior representation by growing a learning tree

    The work presented in this thesis is based on the basic idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is the generality of such an approach, especially that the reinforcement learning paradigm allows systems to be designed which can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem the machine is to solve. Learning is considered to be a bootstrapping procedure. Fragmented past experience, of what to do when performing well, is used for response generation. The new response, in its turn, adds more information to the system about the environment. Gained knowledge is represented by a behavior probability density function. This density function is approximated with a number of normal distributions which are stored in the nodes of a binary tree. The tree structure is grown by applying a recursive algorithm to the stored stimulus-response combinations, called decisions. By considering both the response and the stimulus, the system is able to bring meaning to structures in the input signal. The recursive algorithm is first applied t
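
    A minimal sketch of the data structure described above follows: a binary tree whose nodes store normal distributions over stimulus-response pairs ("decisions"). The splitting rule, the thresholds, and the batch construction are illustrative assumptions; the thesis grows the tree on-line.

```python
import numpy as np

class DecisionNode:
    """Illustrative learning-tree node: each node holds a normal distribution
    over the decisions (stimulus-response vectors) that reached it, and is
    split recursively once it has seen enough data."""
    def __init__(self, decisions, max_leaf_size=20, depth=0, max_depth=12):
        self.mean = decisions.mean(axis=0)
        self.cov = np.cov(decisions.T) + 1e-6 * np.eye(decisions.shape[1])
        self.left = self.right = None
        if len(decisions) > max_leaf_size and depth < max_depth:
            # Split along the direction of largest variance (illustrative rule).
            _, vecs = np.linalg.eigh(self.cov)
            direction = vecs[:, -1]
            side = (decisions - self.mean) @ direction > 0
            if side.any() and (~side).any():
                self.left = DecisionNode(decisions[~side], max_leaf_size, depth + 1, max_depth)
                self.right = DecisionNode(decisions[side], max_leaf_size, depth + 1, max_depth)

    def find_leaf(self, decision):
        # Descend to the leaf whose normal distribution describes this decision.
        if self.left is None:
            return self
        _, vecs = np.linalg.eigh(self.cov)
        direction = vecs[:, -1]
        child = self.right if (decision - self.mean) @ direction > 0 else self.left
        return child.find_leaf(decision)
```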

    Reinforcement Learning Adaptive Control and Explicit Criterion Maximization

    This paper reviews an existing algorithm for adaptive control based on explicit criterion maximization (ECM) and presents an extended version suited for reinforcement learning tasks. Furthermore, assumptions under which the algorithm converges to a local maximum of a long-term utility function are given. Such convergence theorems are very rare for reinforcement learning algorithms working with continuous state and action spaces. A number of similar algorithms, previously suggested to the reinforcement learning community, are briefly surveyed in order to give the presented algorithm a place in the field. The relations between the different algorithms are exemplified by checking their consistency on a simple problem of linear quadratic regulation (LQR).
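
    The paper's ECM algorithm is not reproduced here, but the underlying idea, adjusting controller parameters directly so as to increase an explicit estimate of the long-term utility, can be sketched generically. The routine below uses a finite-difference gradient estimate; the function name, the perturbation scheme, and the interface of simulate_return are assumptions.

```python
import numpy as np

def ecm_policy_search(simulate_return, theta0, step=0.05, perturb=0.1, iters=200, seed=0):
    """Illustrative ECM-style loop: hill-climb controller parameters theta in
    the direction that increases an estimate of the long-term utility.
    simulate_return(theta) is assumed to return the (possibly noisy)
    long-term reward of running the controller with parameters theta."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iters):
        # Two-sided finite-difference estimate of the utility gradient
        # along a random direction.
        d = rng.standard_normal(theta.shape)
        d /= np.linalg.norm(d)
        g = (simulate_return(theta + perturb * d) -
             simulate_return(theta - perturb * d)) / (2 * perturb)
        theta = theta + step * g * d   # ascend the estimated gradient
    return theta
```

    For the LQR consistency check mentioned in the abstract, simulate_return could roll out a linear system under a linear feedback law parameterized by theta and return the negated accumulated quadratic cost.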

    Greedy adaptive critics for LPQ [i.e. LQR] problems: Convergence Proofs

    A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situations. To our knowledge only two such results have been presented for systems in the continuous state space domain. The first is due to Werbos and is concerned with linear function approximation and heuristic dynamic programming. Here no optimal strategy can be found, which is why the result is of limited importance. The second result is due to Bradtke and deals with linear quadratic systems and quadratic function approximators. Bradtke's proof is limited to ADHDP and policy iteration techniques, where the optimal solution is found by a number of successive approximations. This paper deals with greedy techniques, where the optimal solution is aimed for directly. Convergence proofs for a number of adaptive critics, HDP, DHP, ADHDP and ADDHP, are presented. Optimal controllers for linear quadratic regulation (LQR) systems can be found by standard techniques from control theory, but the assumptions made in control theory can be weakened if adaptive critic techniques are employed. The main point of this paper is, however, not to emphasize the differences but to highlight the similarities and by so doing contribute to a theoretical understanding of adaptive critics.
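
    For concreteness, here is a minimal sketch of one of the greedy critics discussed above, an ADHDP-style iteration on the LQR problem, where the critic is a quadratic action-value function and the controller is always the greedy one. The cost-minimization convention and the names are assumptions; the HDP, DHP and ADDHP variants differ in what the critic represents and how it is updated.

```python
import numpy as np

def greedy_adhdp_lqr(A, B, Q, R, iters=500):
    """Illustrative greedy action-dependent critic for LQR: iterate a
    quadratic action-value function over [x; u] with matrix H, always acting
    greedily with respect to it. Returns H and the gain K (u = -K x)."""
    n, m = B.shape
    C = np.block([[Q, np.zeros((n, m))], [np.zeros((m, n)), R]])  # stage cost
    F = np.hstack([A, B])                                         # x' = F [x; u]
    H = C.copy()
    for _ in range(iters):
        Hxx, Hxu = H[:n, :n], H[:n, n:]
        Hux, Huu = H[n:, :n], H[n:, n:]
        # Value matrix of the greedy policy implied by the current critic.
        P = Hxx - Hxu @ np.linalg.solve(Huu, Hux)
        # Greedy backup: stage cost plus the value of the successor state.
        H = C + F.T @ P @ F
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)
    return H, K
```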